Introduction to Computer Vision: Plant Seedlings Classification¶
Problem Statement¶
Context¶
In recent times, the field of agriculture has been in urgent need of modernization, since checking whether plants are growing correctly still demands extensive manual work. Despite several advances in agricultural technology, people working in the industry must still sort and recognize different plants and weeds by hand, which takes considerable time and effort in the long term. This trillion-dollar industry is ripe for technological innovations that cut down on the requirement for manual labor, and this is where Artificial Intelligence can genuinely benefit workers in the field: AI and Deep Learning can greatly shorten the time and energy required to identify plant seedlings. Doing so more efficiently, and potentially more accurately than experienced manual labor, could lead to better crop yields, free up human involvement for higher-order agricultural decision-making, and in the long term result in more sustainable environmental practices in agriculture as well.
Objective¶
The aim of this project is to build a Convolutional Neural Network (CNN) to classify plant seedlings into their respective categories.
Data Dictionary¶
The Aarhus University Signal Processing group, in collaboration with the University of Southern Denmark, has recently released a dataset containing images of unique plants belonging to 12 different species.
The dataset can be downloaded from Olympus.
The data file names are:
- images.npy
- Labels.csv
Due to the large volume of data, the images were converted into the images.npy file and the labels were put into Labels.csv, so that you can work on the project seamlessly without worrying about the high data volume.
The goal of the project is to create a classifier capable of determining a plant's species from an image.
List of Species
- Black-grass
- Charlock
- Cleavers
- Common Chickweed
- Common Wheat
- Fat Hen
- Loose Silky-bent
- Maize
- Scentless Mayweed
- Shepherds Purse
- Small-flowered Cranesbill
- Sugar beet
Note: Please use GPU runtime to execute the code efficiently¶
Steps and tasks that I will employ to carry out this project are as follows:
Import the libraries, load dataset, print shape of data, visualize the images in dataset.
Data Pre-processing:
i. Normalization.
ii. Gaussian Blurring.
iii. Visualize data after pre-processing.
Working on the data to make it compatible:
i. Convert labels to one-hot-vectors.
ii. Print the label for y_train.
iii. Split the dataset into training, testing, and validation set.
iv. First split images and labels into training and testing set with test_size = 0.3.
v. Then further split test data into test and validation set with test_size = 0.5
vi. Check the shape of the data and reshape it into a shape compatible with Keras models if it is not already; if it is already compatible, note this in the notebook.
Building CNN:
i. Define layers.
ii. Set optimizer and loss function. (Use Adam optimizer and categorical crossentropy.)
iii. Fit and evaluate the model and print the confusion matrix.
iv. Visualize predictions for x_test[2], x_test[3], x_test[33], x_test[36], x_test[59].
1. Import the libraries, load dataset, print shape of data, visualize the images in dataset.¶
Loading the dataset¶
from google.colab import drive
drive.mount('/content/drive/')
# import os
# os.chdir('drive/MyDrive/Computer_Vision_Project')
# # !ls "/content/drive/My Drive"
Mounted at /content/drive/
from pathlib import Path
target_dir = Path('drive/MyDrive/UTA - AIML/Computer_Vision_Project/')
#Changing directory to fetch project files
import os
os.chdir(target_dir)
Importing necessary libraries¶
import pandas as pd, numpy as np, matplotlib.pyplot as plt
import seaborn as sns
from matplotlib import pyplot
%matplotlib inline
import warnings
from sklearn.exceptions import DataConversionWarning
warnings.filterwarnings(action='ignore', category=DataConversionWarning)
random_state = 42
import random
random.seed(random_state)
batch_size = 32
epochs = 500
# Create features and labels
from tensorflow.keras.applications.mobilenet import preprocess_input
import cv2
!ls
classifier_color.h5 classifier_color_weights.h5 classifier_grayscale.h5 classifier_grayscale_weights.h5 CV_Project_PresentationTemplate.pptx High_Code_Plant_Seedling_Classification.ipynb images.npy Labels.csv Samson_Akomolafe_High_Code_Plant_Seedling_Classification_Project.html Samson_Akomolafe_High_Code_Plant_Seedling_Classification_Project.ipynb
Loading the Images numpy array into a data array¶
# np.load defaults to allow_pickle=False; enable it to load this array
data = np.load('images.npy', allow_pickle=True)
#Checking the shape of the data
data.shape
(4750, 128, 128, 3)
ylabels = pd.read_csv('Labels.csv')
ylabels.head()
| | Label |
|---|---|
| 0 | Small-flowered Cranesbill |
| 1 | Small-flowered Cranesbill |
| 2 | Small-flowered Cranesbill |
| 3 | Small-flowered Cranesbill |
| 4 | Small-flowered Cranesbill |
ylabels.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4750 entries, 0 to 4749
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Label   4750 non-null   object
dtypes: object(1)
memory usage: 37.2+ KB
Get all unique categories into a list¶
categ = ['Black-grass', 'Charlock', 'Cleavers', 'Common Chickweed', 'Common wheat', 'Fat Hen', 'Loose Silky-bent',
'Maize', 'Scentless Mayweed', 'Shepherds Purse', 'Small-flowered Cranesbill', 'Sugar beet']
num_categ = len(categ)
num_categ
12
Observation:
- There are 12 plant categories in total, so the output layer should predict 12 classes
- We have a total of 4750 plant images
- Each image is of shape 128 X 128
- As the number of channels is 3, images are in RGB (Red, Green, Blue)
Plant Categories Distribution¶
import seaborn as sns
import matplotlib.pyplot as plt
# Load the original Labels.csv to get the distribution for plotting
plot_labels_df = pd.read_csv('Labels.csv')
plt.rcParams["figure.figsize"] = (12,5)
sns.countplot(x=plot_labels_df['Label'], order = plot_labels_df['Label'].value_counts().index, palette='Greens_r', hue=plot_labels_df['Label'], legend=False)
plt.xlabel('Plant Categories')
plt.xticks(rotation=90)
plt.show()
Observation:
- "Loose Silky bent" plant samples are more compared to other categories
- Least plant samples are for "Common Wheat", "Maize"
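One common way to account for such class imbalance (not applied later in this notebook, shown only as a hedged aside) is to weight each class inversely to its frequency and pass the result to Keras `fit()` as `class_weight`. A minimal sketch with hypothetical labels:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical label array standing in for the CSV's Label column
labels = np.array(['Maize'] * 20 + ['Loose Silky-bent'] * 60)

classes = np.unique(labels)  # sorted: ['Loose Silky-bent', 'Maize']
weights = compute_class_weight(class_weight='balanced', classes=classes, y=labels)

# Rarer classes get larger weights; Keras fit() accepts this dict as class_weight
class_weights = dict(zip(range(len(classes)), weights))
print(class_weights)  # class 1 ('Maize') gets weight 2.0, class 0 gets ~0.667
```

Here the "balanced" heuristic assigns each class a weight of n_samples / (n_classes * class_count), so the under-represented class contributes more to the loss.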
Plotting different plant categories in 12x12 grid¶
#Importing ImageGrid to plot the plant sample images
from mpl_toolkits.axes_grid1 import ImageGrid
#defining a figure of size 12X12
fig = plt.figure(1, figsize=(num_categ, num_categ))
grid = ImageGrid(fig, 111, nrows_ncols=(num_categ, num_categ), axes_pad=0.05)
i = 0
index = ylabels.index
# Plotting 12 images from each plant category
for category_id, category in enumerate(categ):
    condition = ylabels["Label"] == category
    plant_indices = index[condition].tolist()
    for j in range(12):
        ax = grid[i]
        ax.imshow(data[plant_indices[j]])
        ax.axis('off')
        if i % num_categ == num_categ - 1:
            # printing the name for each category
            ax.text(200, 70, category, verticalalignment='center')
        i += 1
plt.show()
2. Data Pre-processing:¶
#Importing cv2_imshow for displaying images
from google.colab.patches import cv2_imshow
Resizing and applying Gaussian Blur on a single image and plotting¶
# Resizing the image to half its size, i.e., from 128x128 to 64x64
img = cv2.resize(data[1000],None,fx=0.50,fy=0.50)
#Applying Gaussian Blur
img_g = cv2.GaussianBlur(img,(3,3),0)
#Displaying preprocessed and original images
print("Resized to 50% and applied Gaussian Blurring with kernel size 3X3")
cv2_imshow(img_g)
print('\n')
print("Original Image of size 128X128")
cv2_imshow(data[1000])
Resized to 50% and applied Gaussian Blurring with kernel size 3X3
Original Image of size 128X128
Converting to HSV and masking the background to focus only on the plant¶
# Convert to HSV image
hsvImg = cv2.cvtColor(img_g, cv2.COLOR_BGR2HSV)
cv2_imshow(hsvImg)
# Create mask (parameters - green color range)
lower_green = (25, 40, 50)
upper_green = (75, 255, 255)
mask = cv2.inRange(hsvImg, lower_green, upper_green)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (11, 11))
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
# Create bool mask
bMask = mask > 0
# Apply the mask
clearImg = np.zeros_like(img, np.uint8) # Create empty image
clearImg[bMask] = img[bMask] # Apply boolean mask to the origin image
#Masked Image after removing the background
cv2_imshow(clearImg)
Applying Resize, Gaussian Blur and Masking on All Images¶
data_copy = data.copy()
lower_green = (25, 40, 50)
upper_green = (75, 255, 255)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (11, 11))
preprocessed_data_color = []
for img in data:
    resize_img = cv2.resize(img, None, fx=0.50, fy=0.50)
    Gblur_img = cv2.GaussianBlur(resize_img, (3, 3), 0)
    hsv_img = cv2.cvtColor(Gblur_img, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv_img, lower_green, upper_green)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    bMask = mask > 0
    clearImg = np.zeros_like(resize_img, np.uint8)  # Create empty image
    clearImg[bMask] = resize_img[bMask]  # Apply boolean mask to the original image
    preprocessed_data_color.append(clearImg)
#Preprocessed all plant images
preprocessed_data_color = np.asarray(preprocessed_data_color)
Visualizing the preprocessed color plant images¶
from mpl_toolkits.axes_grid1 import ImageGrid
fig = plt.figure(1, figsize=(num_categ, num_categ))
grid = ImageGrid(fig, 111, nrows_ncols=(num_categ, num_categ), axes_pad=0.05)
i = 0
index = ylabels.index
for category_id, category in enumerate(categ):
    condition = ylabels["Label"] == category
    plant_indices = index[condition].tolist()
    for j in range(12):
        ax = grid[i]
        ax.imshow(preprocessed_data_color[plant_indices[j]] / 255.)
        ax.axis('off')
        if i % num_categ == num_categ - 1:
            ax.text(70, 30, category, verticalalignment='center')
        i += 1
plt.show()
preprocessed_data_color.shape
(4750, 64, 64, 3)
Converting all color images to Grayscale images¶
preprocessed_data_gs = []
for img in preprocessed_data_color:
    gi = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    preprocessed_data_gs.append(gi)
preprocessed_data_gs = np.asarray(preprocessed_data_gs)
preprocessed_data_gs.shape
(4750, 64, 64)
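The grayscale array above has shape (4750, 64, 64), without the trailing channel axis that Keras convolutional layers expect. If this grayscale data were fed to a Conv2D model, it would first need to become (N, 64, 64, 1). A minimal sketch with a dummy batch (variable names here are illustrative, not from the notebook):

```python
import numpy as np

# Dummy batch of grayscale images with the same layout as preprocessed_data_gs: (N, 64, 64)
gray_batch = np.zeros((10, 64, 64), dtype=np.float32)

# Keras Conv2D expects a trailing channel axis: (N, 64, 64, 1)
gray_batch_keras = np.expand_dims(gray_batch, axis=-1)
print(gray_batch_keras.shape)  # (10, 64, 64, 1)
```

The same effect can be had with `gray_batch.reshape(-1, 64, 64, 1)`; the color arrays already carry their 3-channel axis, which is why the reshape step later in the notebook is effectively a no-op for them.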
Visualizing the preprocessed Grayscale plant images¶
fig = plt.figure(1, figsize=(num_categ, num_categ))
grid = ImageGrid(fig, 111, nrows_ncols=(num_categ, num_categ), axes_pad=0.05)
i = 0
index = ylabels.index
for category_id, category in enumerate(categ):
    condition = ylabels["Label"] == category
    plant_indices = index[condition].tolist()
    for j in range(12):
        ax = grid[i]
        ax.imshow(preprocessed_data_gs[plant_indices[j]], cmap='gray', vmin=0, vmax=255)
        ax.axis('off')
        if i % num_categ == num_categ - 1:
            ax.text(70, 30, category, verticalalignment='center')
        i += 1
plt.show()
Converting Grayscale to Edge images using Sobel and Laplacian¶
# Grayscale pixel values are still in [0, 255] here (normalization happens later),
# so no rescaling is needed before edge detection
sobel = cv2.Sobel(preprocessed_data_gs[0], cv2.CV_64F, 1, 1, ksize=3)
laplacian = cv2.Laplacian(preprocessed_data_gs[0], cv2.CV_64F)
cv2_imshow(sobel)
print("\n")
cv2_imshow(laplacian)
Converting all grayscale images to Laplacian edge-detected images¶
preprocessed_data_Edge_Lap = []
for img in preprocessed_data_gs:
    # Grayscale pixels are still in [0, 255] here, so no rescaling before the Laplacian
    egi = cv2.Laplacian(img, cv2.CV_64F)
    preprocessed_data_Edge_Lap.append(egi)
preprocessed_data_Edge_Lap = np.asarray(preprocessed_data_Edge_Lap)
preprocessed_data_Edge_Lap.shape
(4750, 64, 64)
Visualizing the preprocessed Edge plant images¶
fig = plt.figure(1, figsize=(num_categ, num_categ))
grid = ImageGrid(fig, 111, nrows_ncols=(num_categ, num_categ), axes_pad=0.05)
i = 0
index = ylabels.index
for category_id, category in enumerate(categ):
    condition = ylabels["Label"] == category
    plant_indices = index[condition].tolist()
    for j in range(12):
        ax = grid[i]
        ax.imshow(preprocessed_data_Edge_Lap[plant_indices[j]], cmap='gray', vmin=0, vmax=255)
        ax.axis('off')
        if i % num_categ == num_categ - 1:
            ax.text(70, 30, category, verticalalignment='center')
        i += 1
plt.show()
3. Working on the data to make it compatible:¶
Normalization for Images¶
preprocessed_data_gs = preprocessed_data_gs / 255.
preprocessed_data_color = preprocessed_data_color / 255.
preprocessed_data_Edge_Lap = preprocessed_data_Edge_Lap / 255.
Label Encoding and One-Hot encoding for Plant categories¶
ylabels['Label'] = ylabels['Label'].astype('category')
ylabels['Label'] = ylabels['Label'].cat.codes
ylabels.value_counts()
| Label | count |
|---|---|
| 6 | 654 |
| 3 | 611 |
| 8 | 516 |
| 10 | 496 |
| 5 | 475 |
| 1 | 390 |
| 11 | 385 |
| 2 | 287 |
| 0 | 263 |
| 9 | 231 |
| 7 | 221 |
| 4 | 221 |
from tensorflow.keras.utils import to_categorical
ylabels = to_categorical(ylabels, num_classes=12)
print("Shape of y_train:", ylabels.shape)
print("One value of y_train:", ylabels[0])
Shape of y_train: (4750, 12)
One value of y_train: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
Split the dataset into training, validation and testing set¶
from sklearn.model_selection import train_test_split
val_split = 0.25
#1st split into train and test
X_train, X_test1, y_train, y_test1 = train_test_split(preprocessed_data_color, ylabels, test_size=0.30, stratify=ylabels,random_state = random_state)
# Same split on the original (unmasked) color images, kept for individual image predictions later
X_train_color, X_test1_color, y_train_color, y_test1_color = train_test_split(data, ylabels, test_size=0.30, stratify=ylabels,random_state = random_state)
#2nd split into val and test
X_val, X_test, y_val, y_test = train_test_split(X_test1, y_test1, test_size=0.50, stratify=y_test1,random_state = random_state)
# Same split on the original (unmasked) color images, kept for individual image predictions later
X_val_color, X_test_color, y_val_color, y_test_color = train_test_split(X_test1_color, y_test1, test_size=0.50, stratify=y_test1,random_state = random_state)
X = np.concatenate((X_train, X_test1))
y = np.concatenate((y_train, y_test1))
#Printing the shapes for all data splits
print("X_train shape: ", X_train.shape)
print("y_train shape: ", y_train.shape)
print("X_val shape: ", X_val.shape)
print("y_val shape: ", y_val.shape)
print("X_test shape: ", X_test.shape)
print("y_test shape: ", y_test.shape)
print("X shape: ", X.shape)
print("y shape: ", y.shape)
X_train shape:  (3325, 64, 64, 3)
y_train shape:  (3325, 12)
X_val shape:  (712, 64, 64, 3)
y_val shape:  (712, 12)
X_test shape:  (713, 64, 64, 3)
y_test shape:  (713, 12)
X shape:  (4750, 64, 64, 3)
y shape:  (4750, 12)
Observation:
- X_train has 3325 plant images
- X_val has 712 plant images
- X_test has 713 plant images
- Plant images are of shape 64x64 with 3 color channels
#Reshaping data into shapes compatible with Keras models
X_train = X_train.reshape(X_train.shape[0], 64, 64, 3)
X_val = X_val.reshape(X_val.shape[0], 64, 64, 3)
X_test = X_test.reshape(X_test.shape[0], 64, 64, 3)
#Converting type to float
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_val = X_val.astype('float32')
4. Building CNN:¶
First Trial---> Build CNN for preprocessed color Images¶
Using ImageDataGenerator for common data augmentation techniques¶
from tensorflow.keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(shear_range = 0.2,rotation_range=180, # randomly rotate images in the range
zoom_range = 0.1, # Randomly zoom image
width_shift_range=0.1, # randomly shift images horizontally
height_shift_range=0.1, # randomly shift images vertically
horizontal_flip=True, # randomly flip images horizontally
vertical_flip=True # randomly flip images vertically
)
# test_val_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow(X_train,y_train,batch_size=32,seed=random_state,shuffle=True)
# val_set = test_val_datagen.flow(X_val,y_val,batch_size=32,seed=random_state,shuffle=True)
# test_set = test_val_datagen.flow(X_test,y_test,batch_size=32,seed=random_state,shuffle=True)
Importing required libraries for CNN¶
import tensorflow as tf
from keras import layers
from tensorflow.keras.models import Sequential # Sequential groups a linear stack of layers into a tf.keras.Model.
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
#from tensorflow.keras.layers import Conv2D # This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs.
#from tensorflow.keras.layers import MaxPooling2D # Max pooling operation for 2D spatial data.
#from tensorflow.keras.layers import Flatten # Flattens the input. Does not affect the batch size.
#from tensorflow.keras.layers import Dense, Dropout # Dropout: Applies Dropout to the input.
from tensorflow.keras import callbacks
from tensorflow.keras.callbacks import EarlyStopping # Stop training when a monitored metric has stopped improving
from tensorflow.keras import optimizers
Creating a CNN model containing multiple layers for image processing and dense layer for classification¶
CNN Model layers:¶
- Convolutional input layer, 32 feature maps with a size of 3X3 and a rectifier activation function
- Batch Normalization
- Max Pool layer with size 2×2 and a stride of 2
- Convolutional layer, 64 feature maps with a size of 3X3 and a rectifier activation function
- Batch Normalization
- Max Pool layer with size 2×2 and a stride of 2
- Convolutional layer, 64 feature maps with a size of 3X3 and a rectifier activation function.
- Batch Normalization
- Max Pool layer with size 2×2 and a stride of 2
- Flatten layer
- Fully connected (Dense) layers (with 512 and 128 neurons) with ReLU activation
- Dropout layers to reduce overfitting (regularization)
- Output layer with Softmax activation to predict the 12 categories
# Initialising the CNN classifier
classifier = Sequential()
# Add a Convolution layer with 32 kernels of 3X3 shape with activation function ReLU
classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu', padding = 'same'))
#Adding Batch Normalization
classifier.add(layers.BatchNormalization())
# Add a Max Pooling layer of size 2X2
classifier.add(MaxPooling2D(pool_size = (2, 2),strides=2))
# Add another Convolution layer with 64 kernels of 3X3 shape with activation function ReLU
classifier.add(Conv2D(64, (3, 3), activation = 'relu', padding = 'same'))
classifier.add(layers.BatchNormalization())
classifier.add(MaxPooling2D(pool_size = (2, 2),strides=2))
# Add another Convolution layer with 64 kernels of 3X3 shape with activation function ReLU
classifier.add(Conv2D(64, (3, 3), activation = 'relu', padding = 'valid')) #no Padding
classifier.add(layers.BatchNormalization())
classifier.add(MaxPooling2D(pool_size = (2, 2),strides=2))
# Flattening the layer before fully connected layers
classifier.add(Flatten())
# Adding a fully connected layer with 512 neurons
classifier.add(layers.BatchNormalization())
classifier.add(Dense(units = 512, activation = 'relu'))
# Adding dropout with probability 0.2
classifier.add(Dropout(0.2))
# Adding a fully connected layer with 128 neurons
classifier.add(layers.BatchNormalization())
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dropout(0.2))
# The final output layer with 12 neurons to predict the categorical classification
classifier.add(Dense(units = 12, activation = 'softmax'))
/usr/local/lib/python3.12/dist-packages/keras/src/layers/convolutional/base_conv.py:113: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
# Using Adam optimizer, categorical cross-entropy as the loss function, and accuracy as the metric
# initiate Adam optimizer
adam_opt = optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
classifier.compile(optimizer = adam_opt, loss = 'categorical_crossentropy', metrics = ['accuracy'])
# printing summary
classifier.summary()
Model: "sequential"
| Layer (type) | Output Shape | Param # |
|---|---|---|
| conv2d (Conv2D) | (None, 64, 64, 32) | 896 |
| batch_normalization (BatchNormalization) | (None, 64, 64, 32) | 128 |
| max_pooling2d (MaxPooling2D) | (None, 32, 32, 32) | 0 |
| conv2d_1 (Conv2D) | (None, 32, 32, 64) | 18,496 |
| batch_normalization_1 (BatchNormalization) | (None, 32, 32, 64) | 256 |
| max_pooling2d_1 (MaxPooling2D) | (None, 16, 16, 64) | 0 |
| conv2d_2 (Conv2D) | (None, 14, 14, 64) | 36,928 |
| batch_normalization_2 (BatchNormalization) | (None, 14, 14, 64) | 256 |
| max_pooling2d_2 (MaxPooling2D) | (None, 7, 7, 64) | 0 |
| flatten (Flatten) | (None, 3136) | 0 |
| batch_normalization_3 (BatchNormalization) | (None, 3136) | 12,544 |
| dense (Dense) | (None, 512) | 1,606,144 |
| dropout (Dropout) | (None, 512) | 0 |
| batch_normalization_4 (BatchNormalization) | (None, 512) | 2,048 |
| dense_1 (Dense) | (None, 128) | 65,664 |
| dropout_1 (Dropout) | (None, 128) | 0 |
| dense_2 (Dense) | (None, 12) | 1,548 |
Total params: 1,744,908 (6.66 MB)
Trainable params: 1,737,292 (6.63 MB)
Non-trainable params: 7,616 (29.75 KB)
# call back early stopping
callback_es = tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=20, min_delta=0.0001, restore_best_weights=True)
# Fitting the Classifier for Training set and validating for Validation set
batch_size = 32
epochs = 100
# Fit the model
model1 = classifier.fit(training_set,
batch_size=batch_size,
epochs=epochs,
validation_data = (X_val,y_val),
shuffle=True,
callbacks = [callback_es])
Epoch 1/100: 104/104 ━━━━━━━━━━━━━━━━━━━━ 19s 100ms/step - accuracy: 0.3141 - loss: 2.2250 - val_accuracy: 0.0604 - val_loss: 6.3510
...
Epoch 89/100: 104/104 ━━━━━━━━━━━━━━━━━━━━ 4s 40ms/step - accuracy: 0.9201 - loss: 0.2138 - val_accuracy: 0.8862 - val_loss: 0.4537
Early stopping ended training after 89 epochs; the best val_accuracy (0.9017, at epoch 69) was reached 20 epochs before the stop, and those weights were restored.
# Evaluating on Test data
classifier.evaluate(X_test,y_test)
23/23 ━━━━━━━━━━━━━━━━━━━━ 1s 41ms/step - accuracy: 0.9015 - loss: 0.2864
[0.31058722734451294, 0.9018232822418213]
# Training accuracy at the epoch with the lowest training loss
best_model_accuracy = model1.history['accuracy'][np.argmin(model1.history['loss'])]
best_model_accuracy
0.9254135489463806
Observation:
- Test accuracy is 90.2% (test loss 0.31)
- Training accuracy at the lowest-loss epoch is 92.5%
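The argmin lookup used above can be illustrated on a toy history dictionary (a hypothetical stand-in for the `model1.history` dict that Keras fills during `fit`):

```python
import numpy as np

# Toy stand-in for model1.history: look up the accuracy recorded at the
# epoch where the training loss was lowest.
history = {'loss':     [0.90, 0.50, 0.30, 0.40],
           'accuracy': [0.60, 0.78, 0.91, 0.88]}

best_epoch = int(np.argmin(history['loss']))   # epoch index 2
best_acc = history['accuracy'][best_epoch]
print(best_epoch, best_acc)  # 2 0.91
```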
#Printing out the Confusion Matrix
from sklearn.metrics import confusion_matrix
import itertools

def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.Greens):
    # Normalize before plotting so the displayed colors and cell values agree
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
    fig = plt.figure(figsize=(10, 10))
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=90)
    plt.yticks(tick_marks, classes)
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, cm[i, j],
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")
    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
# Predict the classes for the test dataset
predY = classifier.predict(X_test)
predYClasses = np.argmax(predY, axis = 1)
trueY = np.argmax(y_test, axis = 1)
# confusion matrix
confusionMTX = confusion_matrix(trueY, predYClasses)
# plot the confusion matrix
plot_confusion_matrix(confusionMTX, classes = categ)
23/23 ━━━━━━━━━━━━━━━━━━━━ 1s 26ms/step
Observation:
The model makes a noticeable number of wrong predictions for 'Loose Silky-bent' and 'Black-grass' (17 and 18 misclassified images, respectively)
from sklearn.metrics import f1_score
print("Macro F1 (unweighted mean of per-class F1):", f1_score(trueY, predYClasses, average='macro'))
print("Micro F1 (TP/FP/FN counted globally):", f1_score(trueY, predYClasses, average='micro'))
print("Weighted F1 (per-class F1 weighted by support):", f1_score(trueY, predYClasses, average='weighted'))
print(f1_score(trueY, predYClasses, average=None))
Macro F1 (unweighted mean of per-class F1): 0.8907484600480804 Micro F1 (TP/FP/FN counted globally): 0.9018232819074333 Weighted F1 (per-class F1 weighted by support): 0.8998532269059444 [0.48571429 0.93693694 0.96551724 0.93785311 0.92753623 0.9044586 0.82524272 0.98507463 0.97368421 0.83333333 0.94871795 0.96491228]
Observation:
Above are the F1 scores under the different averaging methods; the macro average is pulled down by the weak Black-grass class (per-class F1 of 0.49)
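To make the three averaging modes concrete, here is a small sketch on hypothetical toy labels (not the notebook's data). With an imbalanced, poorly predicted minority class, the macro average drops sharply while the micro average simply equals overall accuracy:

```python
from sklearn.metrics import f1_score

# Toy 3-class problem: class 2 is a minority class the model misses entirely.
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 1, 1]

per_class = f1_score(y_true, y_pred, average=None)       # one F1 per class
macro = f1_score(y_true, y_pred, average='macro')        # unweighted mean of per-class F1
micro = f1_score(y_true, y_pred, average='micro')        # TP/FP/FN counted globally (= accuracy here)
weighted = f1_score(y_true, y_pred, average='weighted')  # per-class F1 weighted by support

# class 2 scores 0, so macro < weighted < micro on this data
print(per_class, macro, micro, weighted)
```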
from sklearn.metrics import classification_report
print(classification_report(trueY, predYClasses, target_names=categ))
precision recall f1-score support
Black-grass 0.55 0.44 0.49 39
Charlock 0.98 0.90 0.94 58
Cleavers 0.95 0.98 0.97 43
Common Chickweed 0.98 0.90 0.94 92
Common wheat 0.89 0.97 0.93 33
Fat Hen 0.84 0.99 0.90 72
Loose Silky-bent 0.79 0.87 0.83 98
Maize 0.97 1.00 0.99 33
Scentless Mayweed 1.00 0.95 0.97 78
Shepherds Purse 0.96 0.74 0.83 34
Small-flowered Cranesbill 0.91 0.99 0.95 75
Sugar beet 0.98 0.95 0.96 58
accuracy 0.90 713
macro avg 0.90 0.89 0.89 713
weighted avg 0.90 0.90 0.90 713
Observation:
Recall is very low for Black-grass (0.44)
Precision is below 0.80 for Black-grass and Loose Silky-bent
The confusion matrix also shows that the model did not perform well on Black-grass
The other classes show a better balance between precision and recall, with good F1 scores
Overall accuracy (90%) is also strong
from sklearn.metrics import multilabel_confusion_matrix
multilabel_confusion_matrix(trueY, predYClasses)
array([[[660, 14],
[ 22, 17]],
[[654, 1],
[ 6, 52]],
[[668, 2],
[ 1, 42]],
[[619, 2],
[ 9, 83]],
[[676, 4],
[ 1, 32]],
[[627, 14],
[ 1, 71]],
[[592, 23],
[ 13, 85]],
[[679, 1],
[ 0, 33]],
[[635, 0],
[ 4, 74]],
[[678, 1],
[ 9, 25]],
[[631, 7],
[ 1, 74]],
[[654, 1],
[ 3, 55]]])
Each 2x2 block above gives one plant category's counts, laid out as [[TN, FP], [FN, TP]]
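A tiny sketch on hypothetical toy labels (not the notebook's data) showing how to unpack each per-class block, given scikit-learn's [[TN, FP], [FN, TP]] layout:

```python
from sklearn.metrics import multilabel_confusion_matrix

# Toy 3-class labels: one 2x2 matrix per class, [[TN, FP], [FN, TP]]
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 1, 1, 2]

mcm = multilabel_confusion_matrix(y_true, y_pred)
for cls, m in enumerate(mcm):
    (tn, fp), (fn, tp) = m
    print(f"class {cls}: TP={tp} FP={fp} FN={fn} TN={tn}")
# class 0: TP=1 FP=0 FN=1 TN=4
# class 1: TP=2 FP=1 FN=0 TN=3
# class 2: TP=1 FP=1 FN=1 TN=3
```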
history_df = pd.DataFrame(model1.history)
history_df.head()
| accuracy | loss | val_accuracy | val_loss | |
|---|---|---|---|---|
| 0 | 0.390075 | 1.878688 | 0.060393 | 6.351015 |
| 1 | 0.538647 | 1.318181 | 0.060393 | 8.160256 |
| 2 | 0.625564 | 1.110506 | 0.060393 | 10.971898 |
| 3 | 0.666767 | 0.989446 | 0.060393 | 12.551048 |
| 4 | 0.688120 | 0.870809 | 0.085674 | 6.687286 |
#Visualizing training vs. validation loss
plt.title('Cross-entropy loss')
plt.plot(model1.history['loss'], label='train')
plt.plot(model1.history['val_loss'], label='val')
plt.legend();
#Visualizing training vs. validation accuracy
plt.title('Accuracy')
plt.plot(model1.history['accuracy'], label='train')
plt.plot(model1.history['val_accuracy'], label='val')
plt.legend();
Observation:
Training loss decreases steadily, and although validation loss is noisy, it ends close to the training loss
Validation accuracy is likewise close to training accuracy
Based on the validation and test scores, no serious overfitting or underfitting is observed
Visualize predictions for x_test[2], x_test[3], x_test[33], x_test[36], x_test[59]¶
for idx in [2, 3, 33, 36, 59]:
    pred = np.argmax(classifier.predict(np.expand_dims(X_test[idx], axis=0)), axis=1)
    actual = np.argmax(y_test[idx])
    print(f"Model predicted category for X_test {idx} is: ", pred)
    print(f"Actual Category for X_test {idx} is: ", actual)
    print(f"Actual Category Name for X_test {idx} is: ", categ[actual])
    cv2_imshow(X_test[idx]*255)
    print("\n")
    cv2_imshow(X_test_color[idx])
    print("="*100)
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 756ms/step Model predicted category for X_test 2 is: [10] Actual Category for X_test 2 is: 10 Actual Category Name for X_test 2 is: Small-flowered Cranesbill
==================================================================================================== 1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 34ms/step Model predicted category for X_test 3 is: [1] Actual Category for X_test 3 is: 1 Actual Category Name for X_test 3 is: Charlock
==================================================================================================== 1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 32ms/step Model predicted category for X_test 33 is: [7] Actual Category for X_test 33 is: 7 Actual Category Name for X_test 33 is: Maize
==================================================================================================== 1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 37ms/step Model predicted category for X_test 36 is: [6] Actual Category for X_test 36 is: 6 Actual Category Name for X_test 36 is: Loose Silky-bent
==================================================================================================== 1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 34ms/step Model predicted category for X_test 59 is: [2] Actual Category for X_test 59 is: 2 Actual Category Name for X_test 59 is: Cleavers
====================================================================================================
Observation:
All five predictions above match the actual labels
classifier.save('./classifier_color.keras') # save classifier (model) and architecture to single file
Conclusion:
We have built a CNN model that predicts the class of a plant seedling quite well.
Increasing the number of epochs and/or adding layers to the model may further improve performance
CNN blocks of convolution, Batch Normalization, and Max Pooling, followed by Dense layers with Dropout, are a good combination for image classification
Try 2 --> Creating CNN model for Grayscale Images¶
#Split the dataset into training, testing, and validation set
from sklearn.model_selection import train_test_split
val_split = 0.25
#1st split into train and test
X_train, X_test1, y_train, y_test1 = train_test_split(preprocessed_data_gs, ylabels, test_size=0.30, stratify=ylabels,random_state = random_state)
# Parallel split of the color images, kept for visualizing individual predictions later
X_train_color, X_test1_color, y_train_color, y_test1_color = train_test_split(data, ylabels, test_size=0.30, stratify=ylabels,random_state = random_state)
#2nd split into val and test
X_val, X_test, y_val, y_test = train_test_split(X_test1, y_test1, test_size=0.50, stratify=y_test1,random_state = random_state)
# Parallel split of the color images, kept for visualizing individual predictions later
X_val_color, X_test_color, y_val_color, y_test_color = train_test_split(X_test1_color, y_test1_color, test_size=0.50, stratify=y_test1_color,random_state = random_state)
X = np.concatenate((X_train, X_test1))
y = np.concatenate((y_train, y_test1))
#Printing the shapes for all data splits
print("X_train shape: ", X_train.shape)
print("y_train shape: ", y_train.shape)
print("X_val shape: ", X_val.shape)
print("y_val shape: ", y_val.shape)
print("X_test shape: ", X_test.shape)
print("y_test shape: ", y_test.shape)
print("X shape: ", X.shape)
print("y shape: ", y.shape)
X_train shape: (3325, 64, 64) y_train shape: (3325, 12) X_val shape: (712, 64, 64) y_val shape: (712, 12) X_test shape: (713, 64, 64) y_test shape: (713, 12) X shape: (4750, 64, 64) y shape: (4750, 12)
Observation:
X_train has 3325 plant images
X_val has 712 plant images
X_test has 713 plant images
Plant images are 64x64 grayscale
#Reshaping data into shapes compatible with Keras models
X_train = X_train.reshape(X_train.shape[0], 64, 64, 1)
X_val = X_val.reshape(X_val.shape[0], 64, 64, 1)
X_test = X_test.reshape(X_test.shape[0], 64, 64, 1)
#Converting type to float
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_val = X_val.astype('float32')
#Using ImageDataGenerator for common data augmentation techniques
from tensorflow.keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(shear_range=0.2,
                                   rotation_range=180,     # randomly rotate images in the range
                                   zoom_range=0.1,         # randomly zoom images
                                   width_shift_range=0.1,  # randomly shift images horizontally
                                   height_shift_range=0.1, # randomly shift images vertically
                                   horizontal_flip=True,   # randomly flip images horizontally
                                   vertical_flip=True)     # randomly flip images vertically
# test_val_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow(X_train,y_train,batch_size=batch_size,seed=random_state,shuffle=True)
# val_set = test_val_datagen.flow(X_val,y_val,batch_size=32,seed=random_state,shuffle=True)
# test_set = test_val_datagen.flow(X_test,y_test,batch_size=32,seed=random_state,shuffle=True)
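The geometric augmentations configured above can be sketched with plain numpy (assumed illustrative equivalents, not the Keras internals): a horizontal flip reverses the column order, and a width shift slides columns with an edge fill.

```python
import numpy as np

# A 4x4 toy "image" so the effect of each transform is easy to read off
img = np.arange(16, dtype=float).reshape(4, 4)

flipped = img[:, ::-1]          # horizontal_flip: reverse column order
shifted = np.empty_like(img)    # width_shift by +1 pixel
shifted[:, 1:] = img[:, :-1]
shifted[:, 0] = img[:, 0]       # 'nearest'-style edge fill

print(flipped[0])  # [3. 2. 1. 0.]
print(shifted[0])  # [0. 0. 1. 2.]
```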
Creating a CNN model containing multiple layers for image processing and dense layer for classification¶
CNN Model layers:¶
- Convolutional input layer, 32 feature maps with a size of 3X3 and a rectifier (ReLU) activation function
- Batch Normalization
- Max Pool layer with size 2X2 and a stride of 2
- Convolutional layer, 64 feature maps with a size of 3X3 and a ReLU activation function
- Batch Normalization
- Max Pool layer with size 2X2 and a stride of 2
- Convolutional layer, 64 feature maps with a size of 3X3 and a ReLU activation function (no padding)
- Batch Normalization
- Max Pool layer with size 2X2 and a stride of 2
- Flatten layer
- Fully connected (Dense) layers with 512 and 256 neurons and ELU activation
- Dropout layer for regularization, to reduce overfitting
- Output layer with a Softmax activation function to predict the 12 categories
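As a quick sanity check, the spatial sizes this stack produces can be traced with the standard convolution/pooling output-size formulas (a sketch, assuming stride-1 convolutions as in the model code):

```python
# Standard output-size formulas for conv and pool layers (stride-1 convs here):
# 'same' padding preserves the spatial size; 'valid' shrinks it by kernel-1.
def conv_out(size, kernel=3, stride=1, padding='same'):
    if padding == 'same':
        return -(-size // stride)             # ceil(size / stride)
    return (size - kernel) // stride + 1      # 'valid': no padding

def pool_out(size, pool=2, stride=2):
    return (size - pool) // stride + 1

s = 64
s = pool_out(conv_out(s, padding='same'))     # conv 'same' + 2x2 pool -> 32
s = pool_out(conv_out(s, padding='same'))     # -> 16
s = pool_out(conv_out(s, padding='valid'))    # 'valid' conv drops 16 -> 14, pool -> 7
print(s, 7 * 7 * 64)  # 7 3136: matches the Flatten size in the summary below
```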
CNN Model building¶
# Initialising the CNN classifier1
classifier1 = Sequential()
# Add a Convolution layer with 32 kernels of 3X3 shape with activation function ReLU
classifier1.add(Conv2D(32, (3, 3), input_shape = (64, 64, 1), activation = 'relu', padding = 'same'))
#Adding Batch Normalization
classifier1.add(layers.BatchNormalization())
# Add a Max Pooling layer of size 2X2
classifier1.add(MaxPooling2D(pool_size = (2, 2),strides=2))
# Add another Convolution layer with 64 kernels of 3X3 shape with activation function ReLU
classifier1.add(Conv2D(64, (3, 3), activation = 'relu', padding = 'same'))
classifier1.add(layers.BatchNormalization())
classifier1.add(MaxPooling2D(pool_size = (2, 2),strides=2))
# Add another Convolution layer with 64 kernels of 3X3 shape, ReLU activation, and no padding
classifier1.add(Conv2D(64, (3, 3), activation = 'relu', padding = 'valid')) #no Padding
classifier1.add(layers.BatchNormalization())
classifier1.add(MaxPooling2D(pool_size = (2, 2),strides=2))
# Flattening the layer before fully connected layers
classifier1.add(Flatten())
# Adding a fully connected layer with 512 neurons
classifier1.add(layers.BatchNormalization())
classifier1.add(Dense(units = 512, activation = 'elu'))
# Adding dropout with probability 0.2
classifier1.add(Dropout(0.2))
# Adding a fully connected layer with 256 neurons
classifier1.add(layers.BatchNormalization())
classifier1.add(Dense(units = 256, activation = 'elu'))
# classifier1.add(Dropout(0.2))
# The final output layer with 12 neurons to predict the categorical classification
classifier1.add(Dense(units = 12, activation = 'softmax'))
/usr/local/lib/python3.12/dist-packages/keras/src/layers/convolutional/base_conv.py:113: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
#Printing the Summary
classifier1.summary()
Model: "sequential_1"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ conv2d_3 (Conv2D) │ (None, 64, 64, 32) │ 320 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ batch_normalization_5 │ (None, 64, 64, 32) │ 128 │ │ (BatchNormalization) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_3 (MaxPooling2D) │ (None, 32, 32, 32) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_4 (Conv2D) │ (None, 32, 32, 64) │ 18,496 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ batch_normalization_6 │ (None, 32, 32, 64) │ 256 │ │ (BatchNormalization) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_4 (MaxPooling2D) │ (None, 16, 16, 64) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_5 (Conv2D) │ (None, 14, 14, 64) │ 36,928 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ batch_normalization_7 │ (None, 14, 14, 64) │ 256 │ │ (BatchNormalization) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_5 (MaxPooling2D) │ (None, 7, 7, 64) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ flatten_1 (Flatten) │ (None, 3136) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ batch_normalization_8 │ (None, 3136) │ 12,544 │ │ (BatchNormalization) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_3 (Dense) │ (None, 512) │ 1,606,144 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout_2 (Dropout) │ (None, 512) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ 
batch_normalization_9 │ (None, 512) │ 2,048 │ │ (BatchNormalization) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_4 (Dense) │ (None, 256) │ 131,328 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_5 (Dense) │ (None, 12) │ 3,084 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 1,811,532 (6.91 MB)
Trainable params: 1,803,916 (6.88 MB)
Non-trainable params: 7,616 (29.75 KB)
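The largest entries in the summary can be sanity-checked by hand (assuming the standard parameter formulas: weights plus biases):

```python
# Parameter counts for the biggest layers in the summary above
conv1  = 3 * 3 * 1 * 32 + 32   # 3x3 kernel, 1 input channel, 32 filters + biases = 320
dense1 = 3136 * 512 + 512      # Flatten (7*7*64 = 3136) -> 512 = 1,606,144
dense2 = 512 * 256 + 256       # 512 -> 256 = 131,328
out    = 256 * 12 + 12         # 256 -> 12 classes = 3,084

print(conv1, dense1, dense2, out)  # 320 1606144 131328 3084
```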
Compiling and Fitting the model¶
# initiate Adam optimizer
adam_opt = optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
classifier1.compile(optimizer = adam_opt, loss = 'categorical_crossentropy', metrics = ['accuracy'])
callback_es = tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=20, min_delta=0.001, restore_best_weights=True)
model2 = classifier1.fit(training_set,
                         batch_size=batch_size,
                         epochs=epochs,
                         validation_data=(X_val, y_val),
                         shuffle=True,
                         callbacks=[callback_es])
/usr/local/lib/python3.12/dist-packages/keras/src/trainers/data_adapters/py_dataset_adapter.py:121: UserWarning: Your `PyDataset` class should call `super().__init__(**kwargs)` in its constructor. `**kwargs` can include `workers`, `use_multiprocessing`, `max_queue_size`. Do not pass these arguments to `fit()`, as they will be ignored. self._warn_if_super_not_called()
Epoch 1/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 14s 70ms/step - accuracy: 0.2702 - loss: 2.6078 - val_accuracy: 0.0604 - val_loss: 10.8253 Epoch 2/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.3993 - loss: 1.7397 - val_accuracy: 0.0604 - val_loss: 12.1732 Epoch 3/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.4683 - loss: 1.5309 - val_accuracy: 0.0604 - val_loss: 9.4147 Epoch 4/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.5237 - loss: 1.3785 - val_accuracy: 0.0604 - val_loss: 8.0313 Epoch 5/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.5476 - loss: 1.3082 - val_accuracy: 0.0815 - val_loss: 5.8779 Epoch 6/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.5553 - loss: 1.2172 - val_accuracy: 0.2570 - val_loss: 2.5072 Epoch 7/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.5875 - loss: 1.1286 - val_accuracy: 0.2725 - val_loss: 2.9968 Epoch 8/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.6104 - loss: 1.0921 - val_accuracy: 0.3357 - val_loss: 2.3472 Epoch 9/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.6271 - loss: 1.0426 - val_accuracy: 0.4565 - val_loss: 1.6236 Epoch 10/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.6262 - loss: 1.0165 - val_accuracy: 0.4410 - val_loss: 1.7604 Epoch 11/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.6597 - loss: 0.9644 - val_accuracy: 0.4508 - val_loss: 1.9534 Epoch 12/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.6419 - loss: 0.9641 - val_accuracy: 0.5885 - val_loss: 1.2548 Epoch 13/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.6667 - loss: 0.9146 - val_accuracy: 0.6362 - val_loss: 1.0817 Epoch 14/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.6742 - loss: 0.8837 - val_accuracy: 0.5323 - val_loss: 1.4469 Epoch 15/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.7084 - loss: 0.8472 - val_accuracy: 0.5983 - val_loss: 1.2313 Epoch 16/100 104/104 
━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.6773 - loss: 0.8696 - val_accuracy: 0.3258 - val_loss: 3.4321 Epoch 17/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.6920 - loss: 0.8067 - val_accuracy: 0.6250 - val_loss: 1.1377 Epoch 18/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.6967 - loss: 0.8099 - val_accuracy: 0.2233 - val_loss: 3.2801 Epoch 19/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.6831 - loss: 0.8233 - val_accuracy: 0.5154 - val_loss: 1.7679 Epoch 20/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.7202 - loss: 0.7617 - val_accuracy: 0.5913 - val_loss: 1.2948 Epoch 21/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.7193 - loss: 0.7509 - val_accuracy: 0.6348 - val_loss: 1.1275 Epoch 22/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.7165 - loss: 0.7379 - val_accuracy: 0.5323 - val_loss: 1.5922 Epoch 23/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.7295 - loss: 0.7055 - val_accuracy: 0.7022 - val_loss: 0.9884 Epoch 24/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.7424 - loss: 0.7020 - val_accuracy: 0.4677 - val_loss: 1.6631 Epoch 25/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.7472 - loss: 0.6622 - val_accuracy: 0.6826 - val_loss: 0.9796 Epoch 26/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.7659 - loss: 0.6388 - val_accuracy: 0.5871 - val_loss: 1.3256 Epoch 27/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.7575 - loss: 0.6794 - val_accuracy: 0.4747 - val_loss: 1.8379 Epoch 28/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.7551 - loss: 0.6624 - val_accuracy: 0.5997 - val_loss: 1.2736 Epoch 29/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.7488 - loss: 0.6636 - val_accuracy: 0.6728 - val_loss: 0.9929 Epoch 30/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.7490 - loss: 0.6300 - val_accuracy: 0.4649 - val_loss: 2.1159 Epoch 31/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step 
- accuracy: 0.7616 - loss: 0.6310 - val_accuracy: 0.4888 - val_loss: 1.9128 Epoch 32/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.7557 - loss: 0.6390 - val_accuracy: 0.4958 - val_loss: 1.6674 Epoch 33/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.7600 - loss: 0.6286 - val_accuracy: 0.6629 - val_loss: 1.0335 Epoch 34/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.7609 - loss: 0.6569 - val_accuracy: 0.7149 - val_loss: 0.9007 Epoch 35/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.7903 - loss: 0.5747 - val_accuracy: 0.6629 - val_loss: 0.9409 Epoch 36/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.7808 - loss: 0.5693 - val_accuracy: 0.5379 - val_loss: 1.4866 Epoch 37/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.7896 - loss: 0.5685 - val_accuracy: 0.4171 - val_loss: 2.3499 Epoch 38/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.7767 - loss: 0.5800 - val_accuracy: 0.7233 - val_loss: 0.8084 Epoch 39/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.7943 - loss: 0.5573 - val_accuracy: 0.6966 - val_loss: 0.9080 Epoch 40/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.7876 - loss: 0.5785 - val_accuracy: 0.5885 - val_loss: 1.3425 Epoch 41/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.7811 - loss: 0.5653 - val_accuracy: 0.7135 - val_loss: 0.8568 Epoch 42/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 21ms/step - accuracy: 0.7912 - loss: 0.5576 - val_accuracy: 0.4635 - val_loss: 1.8808 Epoch 43/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.7913 - loss: 0.5568 - val_accuracy: 0.6854 - val_loss: 1.0603 Epoch 44/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.7959 - loss: 0.5447 - val_accuracy: 0.7205 - val_loss: 0.8357 Epoch 45/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.7948 - loss: 0.5468 - val_accuracy: 0.5548 - val_loss: 1.5590 Epoch 46/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8133 - loss: 0.5176 
- val_accuracy: 0.6376 - val_loss: 1.4772 Epoch 47/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8060 - loss: 0.5054 - val_accuracy: 0.5772 - val_loss: 1.4822 Epoch 48/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.7917 - loss: 0.5301 - val_accuracy: 0.7584 - val_loss: 0.7400 Epoch 49/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8165 - loss: 0.5239 - val_accuracy: 0.7725 - val_loss: 0.6828 Epoch 50/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.7988 - loss: 0.5071 - val_accuracy: 0.7767 - val_loss: 0.6245 Epoch 51/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.7949 - loss: 0.5373 - val_accuracy: 0.6152 - val_loss: 1.3588 Epoch 52/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8225 - loss: 0.4762 - val_accuracy: 0.5758 - val_loss: 1.5258 Epoch 53/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8191 - loss: 0.4835 - val_accuracy: 0.7107 - val_loss: 0.9253 Epoch 54/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.7938 - loss: 0.5213 - val_accuracy: 0.5435 - val_loss: 1.6517 Epoch 55/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.8296 - loss: 0.4417 - val_accuracy: 0.4452 - val_loss: 2.3391 Epoch 56/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8082 - loss: 0.5301 - val_accuracy: 0.7022 - val_loss: 0.9501 Epoch 57/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8037 - loss: 0.4969 - val_accuracy: 0.6180 - val_loss: 1.4339 Epoch 58/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8242 - loss: 0.4578 - val_accuracy: 0.6208 - val_loss: 1.1834 Epoch 59/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.8171 - loss: 0.4604 - val_accuracy: 0.7851 - val_loss: 0.6699 Epoch 60/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 21ms/step - accuracy: 0.8035 - loss: 0.4826 - val_accuracy: 0.5913 - val_loss: 1.5074 Epoch 61/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.8258 - loss: 0.4464 - val_accuracy: 0.7851 - val_loss: 
0.6158 Epoch 62/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8319 - loss: 0.4424 - val_accuracy: 0.6882 - val_loss: 1.0180 Epoch 63/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8207 - loss: 0.4878 - val_accuracy: 0.6404 - val_loss: 1.3777 Epoch 64/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8336 - loss: 0.4263 - val_accuracy: 0.7893 - val_loss: 0.6992 Epoch 65/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8249 - loss: 0.4531 - val_accuracy: 0.7093 - val_loss: 0.9160 Epoch 66/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8494 - loss: 0.4039 - val_accuracy: 0.5801 - val_loss: 1.5139 Epoch 67/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8322 - loss: 0.4513 - val_accuracy: 0.6053 - val_loss: 1.1708 Epoch 68/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8159 - loss: 0.4764 - val_accuracy: 0.5337 - val_loss: 1.7741 Epoch 69/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8270 - loss: 0.4783 - val_accuracy: 0.6826 - val_loss: 1.0951 Epoch 70/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8192 - loss: 0.4698 - val_accuracy: 0.7430 - val_loss: 0.8599 Epoch 71/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8461 - loss: 0.4018 - val_accuracy: 0.6938 - val_loss: 1.0416 Epoch 72/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.8301 - loss: 0.4295 - val_accuracy: 0.4663 - val_loss: 2.1174 Epoch 73/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.8405 - loss: 0.4360 - val_accuracy: 0.7753 - val_loss: 0.7604 Epoch 74/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8419 - loss: 0.4053 - val_accuracy: 0.6348 - val_loss: 1.2683 Epoch 75/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8317 - loss: 0.4371 - val_accuracy: 0.5056 - val_loss: 1.9712 Epoch 76/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8366 - loss: 0.4250 - val_accuracy: 0.7781 - val_loss: 0.6872 Epoch 77/100 104/104 
━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8379 - loss: 0.3956 - val_accuracy: 0.6601 - val_loss: 1.2912 Epoch 78/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.8418 - loss: 0.4063 - val_accuracy: 0.7949 - val_loss: 0.6487 Epoch 79/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.8272 - loss: 0.4378 - val_accuracy: 0.7317 - val_loss: 0.8596 Epoch 80/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8449 - loss: 0.3966 - val_accuracy: 0.7528 - val_loss: 0.7643 Epoch 81/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8580 - loss: 0.3798 - val_accuracy: 0.7978 - val_loss: 0.5767 Epoch 82/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8352 - loss: 0.4090 - val_accuracy: 0.6924 - val_loss: 1.0380 Epoch 83/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8414 - loss: 0.3983 - val_accuracy: 0.7963 - val_loss: 0.6335 Epoch 84/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.8592 - loss: 0.3677 - val_accuracy: 0.7879 - val_loss: 0.6851 Epoch 85/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.8321 - loss: 0.4452 - val_accuracy: 0.7963 - val_loss: 0.6007 Epoch 86/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.8637 - loss: 0.3761 - val_accuracy: 0.6475 - val_loss: 1.3143 Epoch 87/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8582 - loss: 0.3598 - val_accuracy: 0.4874 - val_loss: 2.3338 Epoch 88/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8478 - loss: 0.3891 - val_accuracy: 0.7360 - val_loss: 0.7315 Epoch 89/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8544 - loss: 0.3726 - val_accuracy: 0.6798 - val_loss: 1.1557 Epoch 90/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.8532 - loss: 0.3655 - val_accuracy: 0.6994 - val_loss: 0.9071 Epoch 91/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.8630 - loss: 0.3626 - val_accuracy: 0.7514 - val_loss: 0.8528 Epoch 92/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step 
- accuracy: 0.8605 - loss: 0.3631 - val_accuracy: 0.5337 - val_loss: 2.0876 Epoch 93/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8585 - loss: 0.3797 - val_accuracy: 0.7205 - val_loss: 0.8620 Epoch 94/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8564 - loss: 0.3694 - val_accuracy: 0.7402 - val_loss: 0.9430 Epoch 95/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 18ms/step - accuracy: 0.8443 - loss: 0.3928 - val_accuracy: 0.5843 - val_loss: 2.0882 Epoch 96/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.8300 - loss: 0.4259 - val_accuracy: 0.5843 - val_loss: 2.3750 Epoch 97/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.8317 - loss: 0.4024 - val_accuracy: 0.6713 - val_loss: 1.2803 Epoch 98/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8579 - loss: 0.3572 - val_accuracy: 0.7865 - val_loss: 0.6234 Epoch 99/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8504 - loss: 0.3939 - val_accuracy: 0.6180 - val_loss: 1.3155 Epoch 100/100 104/104 ━━━━━━━━━━━━━━━━━━━━ 2s 19ms/step - accuracy: 0.8695 - loss: 0.3236 - val_accuracy: 0.6503 - val_loss: 1.2760
# Evaluating on the test set
classifier1.evaluate(X_test,y_test)
23/23 ━━━━━━━━━━━━━━━━━━━━ 1s 32ms/step - accuracy: 0.8084 - loss: 0.5529
[0.5465648770332336, 0.8078541159629822]
# Training accuracy at the epoch with the lowest training loss
best_model_accuracy = model2.history['accuracy'][np.argmin(model2.history['loss'])]
best_model_accuracy
0.865864634513855
Observation:
Test accuracy is ~81% (0.808). The training accuracy at the epoch with the lowest training loss is ~87% (0.866).
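Note that the cell above indexes on the lowest *training* loss; for model selection it is usually the epoch with the lowest *validation* loss that matters. A minimal sketch of the distinction (the history values here are illustrative, mirroring the first rows of the history table further below):

```python
import numpy as np

# A Keras History.history-style dict (values are illustrative).
history = {
    'loss':         [2.190, 1.641, 1.497, 1.357, 1.266],
    'val_loss':     [10.825, 12.173, 9.415, 8.031, 5.878],
    'accuracy':     [0.327, 0.435, 0.474, 0.524, 0.554],
    'val_accuracy': [0.060, 0.060, 0.060, 0.060, 0.081],
}

# Epoch with the lowest validation loss -- the usual model-selection criterion.
best_epoch = int(np.argmin(history['val_loss']))
print(best_epoch)                           # -> 4
print(history['val_accuracy'][best_epoch])  # -> 0.081
```

In Keras, `ModelCheckpoint(..., save_best_only=True, monitor='val_loss')` automates this during training.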
# Printing the confusion matrix
from sklearn.metrics import confusion_matrix
import itertools

def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.Greens):
    fig = plt.figure(figsize=(10, 10))
    if normalize:
        # Normalize before plotting so the image and the cell text agree
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=90)
    plt.yticks(tick_marks, classes)
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, cm[i, j],
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")
    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
# Predict the classes for the test set
predY = classifier1.predict(X_test)
predYClasses = np.argmax(predY, axis = 1)
trueY = np.argmax(y_test, axis = 1)
# confusion matrix
confusionMTX = confusion_matrix(trueY, predYClasses)
# plot the confusion matrix
plot_confusion_matrix(confusionMTX, classes = categ)
23/23 ━━━━━━━━━━━━━━━━━━━━ 1s 25ms/step
Observation:
The model confuses the following visually similar pairs:
- Black-grass and Loose Silky-bent
- Common Wheat and Loose Silky-bent
- Shepherds Purse and Scentless Mayweed
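Beyond the confusion matrix, per-class precision and recall quantify these confusions directly. A small self-contained sketch with `sklearn.metrics.classification_report` (the toy labels below are illustrative, not the project's data):

```python
import numpy as np
from sklearn.metrics import classification_report

# Toy 3-class example: class 'a' is often predicted as 'b', which shows
# up as reduced recall for 'a' and reduced precision for 'b'.
y_true = np.array([0, 0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 1, 2, 2, 2])
print(classification_report(y_true, y_pred, target_names=['a', 'b', 'c']))
```

In this notebook, the same call with `trueY`, `predYClasses`, and `target_names=categ` would list the weakest classes directly.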
# Storing each epoch's loss and accuracy for the training and validation sets in a DataFrame
history_df2 = pd.DataFrame(model2.history)
history_df2.head()
| | accuracy | loss | val_accuracy | val_loss |
|---|---|---|---|---|
| 0 | 0.327218 | 2.190298 | 0.060393 | 10.825313 |
| 1 | 0.435188 | 1.640836 | 0.060393 | 12.173200 |
| 2 | 0.474286 | 1.497100 | 0.060393 | 9.414684 |
| 3 | 0.523609 | 1.357162 | 0.060393 | 8.031293 |
| 4 | 0.553684 | 1.266253 | 0.081461 | 5.877923 |
# Visualizing the loss curves
plt.title('Cross-entropy loss')
plt.plot(model2.history['loss'], label='train')
plt.plot(model2.history['val_loss'], label='validation')
plt.legend();
# Visualizing the accuracy curves
plt.title('Accuracy')
plt.plot(model2.history['accuracy'], label='train')
plt.plot(model2.history['val_accuracy'], label='validation')
plt.legend();
Visualize predictions for x_test[2], x_test[3], x_test[33], x_test[36], x_test[59]¶
for idx in [2, 3, 33, 36, 59]:
    pred = np.argmax(classifier1.predict(np.expand_dims(X_test[idx], axis=0)), axis=1)
    actual = np.argmax(y_test[idx])
    print("Model predicted category for X_test", idx, "is: ", pred)
    print("Actual Category for X_test", idx, "is: ", actual)
    print("Actual Category Name for X_test", idx, "is: ", categ[actual])
    cv2_imshow(X_test[idx] * 255)    # preprocessed (grayscale, scaled) image
    print("\n")
    cv2_imshow(X_test_color[idx])    # original color image
    print("=" * 100)
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 530ms/step Model predicted category for X_test 2 is: [10] Actual Category for X_test 2 is: 10 Actual Category Name for X_test 2 is: Small-flowered Cranesbill
==================================================================================================== 1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 28ms/step Model predicted category for X_test 3 is: [1] Actual Category for X_test 3 is: 1 Actual Category Name for X_test 3 is: Charlock
==================================================================================================== 1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 28ms/step Model predicted category for X_test 33 is: [7] Actual Category for X_test 33 is: 7 Actual Category Name for X_test 33 is: Maize
==================================================================================================== 1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 30ms/step Model predicted category for X_test 36 is: [6] Actual Category for X_test 36 is: 6 Actual Category Name for X_test 36 is: Loose Silky-bent
==================================================================================================== 1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 32ms/step Model predicted category for X_test 59 is: [2] Actual Category for X_test 59 is: 2 Actual Category Name for X_test 59 is: Cleavers
====================================================================================================
Observation:
- All five sample images above were correctly classified by the model
# Saving the model and its weights
classifier1.save('./classifier_grayscale.keras')  # full model (architecture + weights) in a single file
classifier1.save_weights('./classifier_grayscale.weights.h5')
!ls
classifier_color.h5 classifier_color.keras classifier_color.weights.h5 classifier_color_weights.h5 classifier_grayscale.h5 classifier_grayscale.keras classifier_grayscale.weights.h5 classifier_grayscale_weights.h5 CV_Project_PresentationTemplate.pptx High_Code_Plant_Seedling_Classification.ipynb images.npy Labels.csv Samson_Akomolafe_High_Code_Plant_Seedling_Classification_Project.html Samson_Akomolafe_High_Code_Plant_Seedling_Classification_Project.ipynb
Conclusion:
We built a CNN model that predicts a plant's class reasonably well.
Increasing the number of epochs and/or adding layers may further improve performance.
A CNN combining Batch Normalization, max pooling, dropout, and dense layers is a strong baseline for image classification.
Data Overview¶
Exploratory Data Analysis¶
- How do the images of the different plant categories differ from each other?
- Is the provided dataset imbalanced? (Check using bar plots.)
Visual Characteristics Analysis¶
Let's meticulously examine the visual output for each of the 12 plant categories. As you go through each category, consider the following:
Black-grass:
- Leaf Shape: Generally narrow, elongated.
- Texture: Appears smooth.
- Size/Density: Often grows in dense clusters, upright.
- Color Nuances: Consistent green.
- Growth Pattern: Upright, grass-like.
Charlock:
- Leaf Shape: Wider, somewhat lobed or irregular edges.
- Texture: Appears slightly rough or textured.
- Size/Density: Medium size, less dense than grasses.
- Color Nuances: Bright to medium green.
- Growth Pattern: More spread out, forming a basal rosette in early stages.
Cleavers:
- Leaf Shape: Oval to lance-shaped, often in whorls around the stem.
- Texture: Distinctly hairy or 'sticky' appearance.
- Size/Density: Can be sprawling, medium density.
- Color Nuances: Medium green.
- Growth Pattern: Prostrate or climbing, spreading growth.
Common Chickweed:
- Leaf Shape: Small, oval to heart-shaped.
- Texture: Smooth to slightly hairy.
- Size/Density: Small, often dense, low-growing.
- Color Nuances: Light to medium green.
- Growth Pattern: Creeping, mat-forming.
Common Wheat:
- Leaf Shape: Very narrow, elongated, grass-like.
- Texture: Smooth.
- Size/Density: Upright, relatively uniform growth.
- Color Nuances: Consistent medium green.
- Growth Pattern: Upright, slender blades.
Fat Hen:
- Leaf Shape: Diamond or triangular shape, often with wavy or toothed edges.
- Texture: Appears powdery or mealy on the underside of leaves (though hard to see in images).
- Size/Density: Can grow quite large, bushy, medium density.
- Color Nuances: Bluish-green or grayish-green.
- Growth Pattern: Upright, branching.
Loose Silky-bent:
- Leaf Shape: Narrow, pointed, grass-like.
- Texture: Smooth, silky sheen.
- Size/Density: Upright, forms tufts.
- Color Nuances: Medium to dark green.
- Growth Pattern: Similar to other grasses but with a distinct, often looser, form.
Maize:
- Leaf Shape: Very broad, long, lance-shaped leaves.
- Texture: Smooth, prominent veins.
- Size/Density: Large, upright, low density of individual plants but can form dense rows.
- Color Nuances: Deep green.
- Growth Pattern: Strong upright stem, distinct leaf arrangement.
Scentless Mayweed:
- Leaf Shape: Finely dissected, fern-like leaves.
- Texture: Appears delicate, feathery.
- Size/Density: Bushy, medium density.
- Color Nuances: Bright green.
- Growth Pattern: Erect to spreading, much-branched.
Shepherds Purse:
- Leaf Shape: Basal rosette with deeply lobed leaves, stem leaves are smaller and less lobed.
- Texture: Slightly hairy.
- Size/Density: Small to medium, rosette-forming.
- Color Nuances: Light to medium green.
- Growth Pattern: Rosette at base, central flowering stem.
Small-flowered Cranesbill:
- Leaf Shape: Rounded, palmately lobed leaves.
- Texture: Hairy or fuzzy.
- Size/Density: Medium, often sprawling.
- Color Nuances: Medium green.
- Growth Pattern: Spreading, sometimes prostrate.
Sugar beet:
- Leaf Shape: Large, oval to heart-shaped, broad leaves.
- Texture: Smooth, slightly glossy.
- Size/Density: Large, forms a dense rosette.
- Color Nuances: Dark green.
- Growth Pattern: Large basal rosette.
Distinguishing Features Between Categories:¶
- Grass-like vs. Broadleaf: Black-grass, Common Wheat, and Loose Silky-bent are distinctly grass-like with narrow leaves, differentiating them from all other broadleaf species.
- Leaf Dissection: Scentless Mayweed has uniquely finely dissected leaves, unlike any other category.
- Leaf Texture: Cleavers and Small-flowered Cranesbill are noticeably hairy/fuzzy, while Fat Hen has a powdery/mealy appearance. Maize and Sugar beet have smooth, broad leaves.
- Growth Pattern: Shepherds Purse starts with a clear basal rosette, and Common Chickweed is a low-growing mat. Maize grows very upright with distinct broad leaves. Cleavers tends to sprawl.
- Color Nuances: Fat Hen often shows a distinct bluish-green or grayish-green hue compared to the brighter greens of many others.
Summarize Plant Image Differences¶
Distinguishing Features Between Categories¶
Based on the visualization of plant images across 12 categories, several key visual characteristics differentiate them, which a classification model would likely leverage:
Leaf Shape and Structure:
- Some plants like 'Charlock' and 'Scentless Mayweed' exhibit broad, often lobed or toothed leaves. 'Small-flowered Cranesbill' has distinctly palmate or deeply lobed leaves.
- In contrast, 'Loose Silky-bent' and 'Common Wheat' have very slender, grass-like leaves.
- 'Fat Hen' and 'Sugar beet' often show more ovate or diamond-shaped leaves, sometimes with a powdery appearance.
Leaf Color and Texture:
- While most plants are green, variations in shades (lighter vs. darker green) and the presence of veins or a glossy/dull surface can be observed. The pre-processing step with Gaussian blur and HSV masking helps in highlighting the plant structure by removing background noise, which is crucial for texture analysis.
- 'Black-grass' might have a darker, more uniform green, while others might show lighter hues.
Growth Pattern and Density:
- Some plants, like 'Common Chickweed', tend to grow in dense, sprawling mats, while others, such as 'Maize', exhibit a more upright, singular stem growth with larger, distinct leaves.
- 'Shepherds Purse' often presents a basal rosette of leaves before sending up a flower stalk.
Edge Characteristics (as highlighted by Laplacian Edge Detection):
- The Laplacian edge detection revealed distinct patterns. Plants with complex leaf structures (e.g., 'Small-flowered Cranesbill') produce more intricate edge maps, while simple, linear leaves (e.g., 'Loose Silky-bent') show fewer, straighter edges. This can be a strong feature for models to distinguish between broadleaf and grass-like plants.
Presence of Hairs or Spines:
- Though not explicitly visible in all scaled images, some species are known for trichomes (hairs) which could contribute to a textured appearance that models might pick up.
Overall Plant Silhouette/Boundary:
- The general outline or silhouette of the plant, especially after masking the background, provides a strong cue. For instance, the compact, often circular shape of some young seedlings versus the elongated or irregular shapes of others.
The classification model likely relies on a combination of these features, with convolutional layers extracting hierarchical representations of edges, textures, and shapes that are unique to each plant category. The data augmentation techniques applied (rotation, zoom, shifts, flips) help the model generalize these features regardless of the plant's orientation or slight variations in appearance.
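The flips and rotations mentioned above can be illustrated with plain NumPy (a simplified stand-in for the Keras augmentation pipeline; real training-time augmentation also interpolates for arbitrary angles and shifts):

```python
import numpy as np

# A tiny 'image' so the transforms are easy to verify by eye.
img = np.arange(6).reshape(2, 3)      # [[0, 1, 2],
                                      #  [3, 4, 5]]

h_flip = img[:, ::-1]                 # horizontal flip: [[2, 1, 0], [5, 4, 3]]
v_flip = img[::-1, :]                 # vertical flip:   [[3, 4, 5], [0, 1, 2]]
rot90  = np.rot90(img)                # 90-degree counter-clockwise rotation

print(rot90)                          # -> [[2 5] [1 4] [0 3]]
print(rot90.shape)                    # -> (3, 2)
```

Each transform leaves the class label unchanged, which is why they are safe label-preserving augmentations for seedling images.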
Summary:¶
Q&A¶
The task's implicit question was to analyze the visual characteristics of plant seedling categories and provide a detailed summary of how they are visually distinct, highlighting key features for classification. This was successfully addressed by meticulously examining and categorizing the visual features for each of the 12 plant types and then summarizing their distinguishing characteristics.
Data Analysis Key Findings¶
- Comprehensive Visual Differentiation: Detailed visual characteristics were identified for 12 plant categories, including specific leaf shapes (e.g., narrow and elongated for Black-grass, finely dissected for Scentless Mayweed, palmate for Small-flowered Cranesbill), textures (e.g., hairy for Cleavers, smooth for Maize, powdery for Fat Hen), sizes/densities, color nuances (e.g., Fat Hen's distinct bluish-green), and growth patterns (e.g., Common Chickweed's mat-forming, Shepherds Purse's basal rosette, Maize's upright growth).
- Key Distinguishing Feature Categories: The analysis consolidated the differences into several key categories valuable for classification models:
- Leaf Shape and Structure: Ranging from slender, grass-like leaves (e.g., 'Loose Silky-bent', 'Common Wheat') to broad, lobed, or intricate leaves (e.g., 'Charlock', 'Scentless Mayweed', 'Small-flowered Cranesbill').
- Leaf Color and Texture: Variations in green hues and surface properties were noted, with pre-processing techniques like Gaussian blur and HSV masking identified as crucial for highlighting texture.
- Growth Pattern and Density: Distinctive patterns were observed, such as dense, sprawling growth versus upright singular stems or basal rosettes.
- Edge Characteristics: Laplacian edge detection revealed intricate edge maps for complex leaves and simpler, straighter edges for linear leaves, serving as a strong feature to differentiate broadleaf from grass-like plants.
- Overall Plant Silhouette/Boundary: The general outline of the plant after background removal was identified as a significant visual cue.
Insights or Next Steps¶
- Plant seedling classification relies on a combination of diverse visual features, from macroscopic growth patterns to microscopic leaf textures and edge complexities.
- Further analysis could involve applying more advanced image processing techniques to quantify these identified visual features, creating robust numerical representations for training machine learning models.
Summary:¶
Q&A¶
Yes, the dataset is imbalanced. "Loose Silky-bent" is the most represented plant with 654 samples, while "Common Wheat" and "Maize" are the least represented, each with only 221 samples. This indicates a significantly uneven distribution across plant categories.
Data Analysis Key Findings¶
- The dataset contains 12 distinct plant categories.
- The most frequently occurring plant category, "Loose Silky-bent" (label 6), has 654 samples.
- The least frequently occurring plant categories, "Common Wheat" (label 4) and "Maize" (label 7), each have 221 samples.
- The sample counts for other categories range between these extremes, confirming a substantial class imbalance.
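The integer labels are consistent with an alphabetical (ASCII) sort of the class names: the predictions shown earlier give Charlock=1, Cleavers=2, Loose Silky-bent=6, Maize=7, Small-flowered Cranesbill=10, all of which match the sorted order. A sketch of that mapping (the exact spellings below are assumptions based on this notebook's category names):

```python
# Build the name -> label mapping by sorting the category names.
species = sorted([
    'Black-grass', 'Charlock', 'Cleavers', 'Common Chickweed',
    'Common wheat', 'Fat Hen', 'Loose Silky-bent', 'Maize',
    'Scentless Mayweed', 'Shepherds Purse',
    'Small-flowered Cranesbill', 'Sugar beet',
])
label_of = {name: i for i, name in enumerate(species)}

print(label_of['Maize'])           # -> 7
print(label_of['Common wheat'])    # -> 4
```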
Insights or Next Steps¶
- The significant class imbalance (ranging from 221 to 654 samples per class) could negatively impact the performance of machine learning models, especially for minority classes.
- Consider implementing techniques such as oversampling (e.g., SMOTE), undersampling, or using class weights during model training to address the detected imbalance and improve model generalization.
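As a sketch of the class-weight option, `sklearn.utils.class_weight.compute_class_weight` assigns each class a weight inversely proportional to its frequency. The 654 and 221 counts below come from the findings above; the third count is purely illustrative:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Labels for a 3-class toy set: 654 / 221 / 400 samples per class.
labels = np.repeat([0, 1, 2], [654, 221, 400])
weights = compute_class_weight(class_weight='balanced',
                               classes=np.array([0, 1, 2]), y=labels)
class_weight = dict(enumerate(weights))

# Minority class 1 gets the largest weight; passing class_weight=class_weight
# to Keras model.fit(...) makes its errors count more during training.
print(class_weight)
```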
!pip install nbconvert
Requirement already satisfied: nbconvert in /usr/local/lib/python3.12/dist-packages (7.16.6) [full dependency listing trimmed; all requirements already satisfied]
%%shell
jupyter nbconvert --to html '/content/drive/My Drive/UTA - AIML/Computer_Vision_Project/Samson_Akomolafe_High_Code_Plant_Seedling_Classification_Project_a.ipynb'
[NbConvertApp] WARNING | pattern '/content/drive/My Drive/Computer_Vision_Projectt/Samson_Akomolafe_High_Code_Plant_Seedling_Classification_Project_a.ipynb' matched no files
[Output trimmed: because the pattern matched no files (note the mistyped directory 'Computer_Vision_Projectt' in the executed command, versus the path shown above), nbconvert printed its full CLI usage help and the cell raised CalledProcessError: exit status 255.]